85 research outputs found

    An Optimal k Nearest Neighbours Ensemble for Classification Based on Extended Neighbourhood Rule with Features subspace

    To minimize the effect of outliers, kNN ensembles identify a set of closest observations to a new sample point and estimate its unknown class by majority voting over the labels of the training instances in the neighbourhood. Ordinary kNN-based procedures determine the k closest training observations in the neighbourhood region (enclosed by a sphere) by using a distance formula. The k nearest neighbours procedure may not work when sample points in the test data follow the pattern of nearest observations that lie on a certain path not contained in the given sphere of nearest neighbours. Furthermore, these methods combine hundreds of base kNN learners, many of which might have high classification errors, thereby resulting in poor ensembles. To overcome these problems, an optimal extended neighbourhood rule-based ensemble is proposed in which the neighbours are determined in k steps. It starts from the first nearest sample point to the unseen observation. The second data point identified is the one closest to the previously selected data point. This process continues until the required number of k observations is obtained. Each base model in the ensemble is constructed on a bootstrap sample in conjunction with a random subset of features. After building a sufficiently large number of base models, the optimal models are selected based on their performance on out-of-bag (OOB) data. Comment: 12 pages
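The k-step neighbour search described in this abstract can be sketched as follows. This is only an illustrative reading of the abstract, not the authors' implementation: the function names `extended_neighbours` and `predict`, the Euclidean distance, and the tie-breaking by index are all assumptions, and the bootstrap, feature-subset and OOB model-selection steps are left out.

```python
import math
from collections import Counter


def extended_neighbours(train_X, query, k):
    """Pick k training indices by chaining: each step selects the unused
    point closest to the previously selected point (the first step starts
    from the query itself). Euclidean distance is an assumption."""
    remaining = set(range(len(train_X)))
    chosen = []
    current = query
    for _ in range(min(k, len(train_X))):
        # Ties are broken by the smaller index (an arbitrary choice).
        nxt = min(remaining, key=lambda i: (math.dist(train_X[i], current), i))
        chosen.append(nxt)
        remaining.remove(nxt)
        current = train_X[nxt]
    return chosen


def predict(train_X, train_y, query, k):
    """Majority vote over the labels of the chained neighbours."""
    idx = extended_neighbours(train_X, query, k)
    votes = Counter(train_y[i] for i in idx)
    return votes.most_common(1)[0][0]


# Points 0-2 lie on a path leading away from the query; points 3-4 sit
# inside the sphere that an ordinary 3-NN search would use.
X = [(1, 0), (2, 0), (3, 0), (0, 1.5), (0, -1.5)]
y = ["a", "a", "a", "b", "b"]
print(extended_neighbours(X, (0, 0), 3), predict(X, y, (0, 0), 3))
```

In this toy data the chained search follows the path of points 0, 1, 2, whereas an ordinary 3-NN search around the query would pick points 0, 3 and 4.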

    Neotectonic Activity in Quetta-Ziarat Region, Northwest Quetta City, Pakistan

    Geomorphic parameters are useful because they can quickly characterize an area undergoing tectonic adjustment. For this purpose, four indices were applied to examine active tectonics in the Quetta-Ziarat region: sub-basin asymmetry (Af), transverse topography (T-Factor), hypsometric integral (HI) and stream-length gradient (SL). Three of the indices (Af, HI and SL) indicate low tectonic activity, whereas the T-Factor suggests a moderate to high level of tectonic activity. The index of active tectonics (IAT) indicates a low to moderate level of active tectonics. In addition, these indices are compared with lithological and climatic effects to reach a final judgement on the neotectonics.

    Reduced reference image and video quality assessments: review of methods

    With the growing demand for image and video-based applications, the need for consistent quality assessment metrics for images and video has increased. Different approaches have been proposed in the literature to estimate the perceptual quality of images and videos. These approaches can be divided into three main categories: full reference (FR), reduced reference (RR) and no-reference (NR). In RR methods, instead of providing the original image or video as a reference, we need to provide certain features (e.g., texture, edges) of the original image or video for quality assessment. During the last decade, RR-based quality assessment has been a popular research area for a variety of applications such as social media, online games, and video streaming. In this paper, we present a review and classification of the latest research work on RR-based image and video quality assessment. We also summarize the different databases used in the field of 2D and 3D image and video quality assessment. This paper should help specialists and researchers stay well-informed about recent progress in RR-based image and video quality assessment. The review and classification presented here will also be useful for gaining an understanding of multimedia quality assessment and the state-of-the-art approaches used for the analysis. In addition, it will help the reader select appropriate quality assessment methods and parameters for their respective applications.

    Urdu Handwritten Characters Data Visualization and Recognition Using Distributed Stochastic Neighborhood Embedding and Deep Network

    This study was supported by the China University of Petroleum-Beijing and Fundamental Research Funds for Central Universities under Grant no. 2462020YJRC001. Peer reviewed.

    Morphological lodge of desi cotton (Gossypium arboreum L.) genotypes and stage-manage by planting log under dry tropical prospect

    Planting date is the most important factor that directly influences plant traits under the naturally prevailing environment. The aim of the trial was to determine the influence of planting interval on the morphology of desi cotton (Gossypium arboreum L.) varieties under dry tropical coastal conditions. The research was carried out during 2016 on three desi cotton genotypes, C1 (FDH-512), C2 (FDH-502) and C3 (FDH-170), under three fortnightly sowing regimes (S1 = 15 March, S2 = 1 April and S3 = 15 April) at the agronomy research area of the Lasbela University of Agriculture, Water and Marine Science, Uthal, Lasbela, Pakistan. Notable results were obtained for different morphological traits under the arid environment. Significant results were observed across sowing dates and desi cotton varieties for the following traits: number of monopodial branches, number of sympodial branches, number of capsules per plant, number of seeds per capsule, number of locules per capsule, number of seeds per locule, weight of seed per capsule, seed colour, seed yield per plant, lint percentage, root-shoot ratio (%) and root depth (cm). The number of locules per capsule and the number of seeds per locule yielded completely non-significant outcomes for both the sowing periods and the desi cotton genotypes. The interaction between the two factors was non-significant for all traits. Correlations among individual cotton traits were examined, and it was found that capsules per plant and lint percentage, monopodial branches per plant, root-shoot ratio, root depth, seed weight per capsule and seed yield per plant were significantly and positively correlated. Seed yield and lint percentage were also significantly correlated, which suggests that selection for lint percentage, monopodial branches, seed yield per plant, capsules per plant and seed weight per capsule may respond positively and give a superior cotton yield. Under the existing dry climatic conditions, the 15 April planting window was found to be the most suitable for cultivating the desi cotton genotype FDH-170.

    Short-Term Prediction of COVID-19 Using Novel Hybrid Ensemble Empirical Mode Decomposition and Error Trend Seasonal Model

    In this article, a new hybrid time series model is proposed to predict COVID-19 daily confirmed cases and deaths. Due to the variations and complexity in the data, it is very difficult to predict its future trajectory using linear time series or mathematical models. A novel hybrid ensemble empirical mode decomposition and error trend seasonal (EEMD-ETS) model has therefore been developed to forecast the COVID-19 pandemic. The proposed hybrid model applies EEMD to decompose the complex, nonlinear, and nonstationary data into different intrinsic mode functions (IMFs), from low to high frequencies, and a single monotone residue. The stationarity of each IMF component is checked with the augmented Dickey–Fuller (ADF) test, the components are then used to build the EEMD-ETS model, and finally future predictions are obtained from the proposed hybrid model. To illustrate and check the performance of the proposed model, four datasets of daily confirmed cases and deaths from COVID-19 in Italy, Germany, the United Kingdom (UK), and France have been used. Similarly, four statistical metrics, i.e., root mean square error (RMSE), symmetric mean absolute percentage error (sMAPE), mean absolute error (MAE), and mean absolute percentage error (MAPE), have been used to compare different time series models. The results show that the proposed hybrid EEMD-ETS model outperforms the other time series and machine learning models. Hence, it is worth using as an effective model for the prediction of COVID-19.
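The four accuracy metrics named in this abstract can be computed as in the minimal sketch below. The abstract does not give its formulas, so these are the common textbook definitions; in particular, sMAPE has several variants, and the one used here (mean of 2|y - f| / (|y| + |f|), in percent) is an assumption. The function name `forecast_errors` and the sample numbers are made up for illustration.

```python
import math


def forecast_errors(actual, forecast):
    """Return RMSE, MAE, MAPE (%) and sMAPE (%) for two equal-length series.

    MAPE assumes no actual value is zero; sMAPE assumes that
    |actual| + |forecast| is never zero.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    n = len(actual)
    abs_err = [abs(a - f) for a, f in zip(actual, forecast)]
    rmse = math.sqrt(sum(e * e for e in abs_err) / n)
    mae = sum(abs_err) / n
    mape = 100.0 * sum(e / abs(a) for e, a in zip(abs_err, actual)) / n
    smape = 100.0 * sum(
        2.0 * e / (abs(a) + abs(f)) for e, a, f in zip(abs_err, actual, forecast)
    ) / n
    return {"rmse": rmse, "mae": mae, "mape": mape, "smape": smape}


# Made-up daily counts and forecasts, for illustration only.
print(forecast_errors([100, 200, 300], [110, 190, 330]))
```

A lower value on each metric indicates a closer fit, which is how such metrics are typically used to rank competing time series models.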

    Multisource Data Fusion Framework for Land Use/Land Cover Classification Using Machine Vision

    Data fusion is a powerful tool for merging multiple sources of information to produce a better output than any individual source. This study describes the data fusion of five land use/cover types derived from remote sensing: bare land, fertile cultivated land, desert rangeland, green pasture, and Sutlej basin river land. A novel framework for multispectral and texture feature-based data fusion is designed to identify the land use/land cover types correctly. Multispectral data are obtained using a multispectral radiometer, while a digital camera is used for the image dataset. Each image contained 229 texture features, from which 30 optimized texture features per image were obtained by combining three feature selection techniques: Fisher, Probability of Error plus Average Correlation, and Mutual Information. This 30-optimized-texture-feature dataset is merged with the five-spectral-feature dataset to build the fused dataset. A comparison is performed among the texture, multispectral, and fused datasets using machine vision classifiers. The fused dataset outperformed both individual datasets. The overall accuracy obtained using a multilayer perceptron for texture data, multispectral data, and fused data was 96.67%, 97.60%, and 99.60%, respectively.
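One simple way to read the merging step in this abstract is feature-level fusion by concatenation: each sample's 30 texture features are joined with its 5 spectral features to give a 35-dimensional vector. The sketch below assumes that reading; the abstract does not say how the merge was actually done, and the function name `fuse_features` and the placeholder values are hypothetical.

```python
def fuse_features(texture_rows, spectral_rows):
    """Concatenate per-sample texture and spectral feature vectors.

    Both inputs must list the same samples in the same order.
    """
    if len(texture_rows) != len(spectral_rows):
        raise ValueError("both feature sets must cover the same samples")
    return [list(t) + list(s) for t, s in zip(texture_rows, spectral_rows)]


# Two placeholder samples: 30 texture features and 5 spectral features each.
texture = [[0.1] * 30, [0.2] * 30]
spectral = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
fused = fuse_features(texture, spectral)
print(len(fused), len(fused[0]))
```

The fused rows could then be passed to any classifier in place of the texture-only or spectral-only rows.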

    Forecasting COVID-19 in Pakistan.

    Objectives: Forecasting epidemics like COVID-19 is of crucial importance. It helps not only governments but also medical practitioners to know the future trajectory of the spread, which might help them with the best possible treatments, precautionary measures and protections. In this study, the popular autoregressive integrated moving average (ARIMA) model will be used to forecast the cumulative number of confirmed cases, recovered cases, and deaths from COVID-19 in Pakistan from June 25, 2020 to July 04, 2020 (a 10-day-ahead forecast). Methods: To meet the desired objectives, data for this study have been taken from the Ministry of National Health Service of Pakistan's website for February 27, 2020 to June 24, 2020. Two different ARIMA models will be used to obtain the 10-day-ahead point and 95% interval forecasts of the cumulative confirmed cases, recovered cases, and deaths. The statistical software RStudio, with the "forecast", "ggplot2", "tseries", and "seasonal" packages, has been used for data analysis. Results: The forecasted cumulative confirmed cases, recovered cases, and deaths up to July 04, 2020 are 231239 with a 95% prediction interval of (219648, 242832), 111616 with a prediction interval of (101063, 122168), and 5043 with a 95% prediction interval of (4791, 5295), respectively. Statistical measures, i.e., root mean square error (RMSE) and mean absolute error (MAE), are used to assess model accuracy. The analysis shows that the ARIMA and seasonal ARIMA models are better than the other time series models in terms of forecasting accuracy and are hence recommended for forecasting epidemics like COVID-19. Conclusion: The forecasting accuracy of ARIMA models in terms of RMSE and MAE is better than that of the other time series models, so they could be considered a good tool for forecasting the spread, recoveries, and deaths from the current outbreak of COVID-19. In addition, this study can help decision-makers develop short-term strategies with regard to the current number of disease occurrences until an appropriate medication is developed.